Method for dynamically adjusting the spectral content of an audio signal

ABSTRACT

A method for dynamically adjusting the spectral content of an audio signal, which increases the harmonic content of said audio signal, said method comprising translating an encoded digital signal into data bands, creating a psychoacoustic model to identify sections of said data bands that are deficient in harmonic quality, analyzing the fundamental frequency and amplitude of said harmonically deficient data bands, creating additional higher order harmonics for said harmonically deficient data bands, adding said higher order harmonics back to said encoded digital signal to form a newly enhanced signal, inverse filtering said newly enhanced signal, and converting said inverse filtered signal to an analog waveform for consumption by the listener.

This application is a continuation of and claims the benefit of U.S. Utility application Ser. No. 13/037,207, now issued as U.S. Pat. No. 8,687,818, filed Feb. 28, 2011, which is a continuation of U.S. Utility application Ser. No. 11/708,452, filed Feb. 20, 2007, which claims benefit of and priority to U.S. Provisional Patent Application No. 60/794,293, filed Apr. 22, 2006, and also which is a continuation-in-part application of U.S. Ser. No. 11/633,908, filed Dec. 5, 2006, which claims benefit of and priority to U.S. Provisional Patent Application No. 60/794,293, filed Apr. 22, 2006. The specification, figures and complete disclosures of U.S. Provisional Patent Application No. 60/794,293 and U.S. Utility application Ser. Nos. 11/633,908; 11/653,510; 11/708,452; and 13/037,207 are incorporated herein by specific reference for all purposes.

FIELD OF INVENTION

The present invention relates to a method for dynamically adjusting the spectral content of a digital audio signal wherein significant processing is performed to modify a signal's harmonic content.

BACKGROUND OF THE INVENTION

Much audio is stored, distributed and processed in the digital domain. Regardless of this fact, the audio must ultimately be converted back to analog in order to be used. Many audio purists resist the digitization of audio, preferring pure analog sources such as LP recordings, which originate from analog master tapes. This is because of inherent defects in what are termed “lossy compression” and “lossless compression” in audio data compression. In both lossy and lossless compression, information redundancy is reduced, using methods such as coding, pattern recognition and linear prediction to reduce the amount of information used to describe the data. The idea behind lossy audio compression was to use psychoacoustics to recognize that not all data in an audio stream can be perceived by the human auditory system. Most lossy compression reduces perceptual redundancy by first identifying sounds which are considered perceptually insignificant. Typical examples include high frequencies, or sounds that occur at the same time as other louder sounds, which are coded with decreased accuracy or not coded at all.

However, reducing perceptual redundancy often does not achieve sufficient compression for a particular application and requires further lossy compression with a difference in quality that is more readily perceived by the user. While the data reduction is again guided by some model of how important the sound is as perceived by the human ear, with the goal of efficiency and optimized quality for the target data rate, the use of lossy compression may result in a perceived reduction of the audio quality that ranges from none to severe.

Currently, data removed during lossy compression cannot be recovered by decompression. Additionally, audio quality is affected when a file is decompressed and recompressed (generational losses), which makes lossy compression unsuitable for storing the intermediate results in professional audio engineering applications but makes it very popular with end users (particularly MP3) since a megabyte can store almost a minute's worth of music at adequate quality.

Timbre or tone color is known in psychoacoustics as sound quality or sound color. Timbre has been called “the psychoacoustician's multidimensional wastebasket category” as it can denote many apparently unrelated aspects of sound. McAdams, S., and Bregman, A. “Hearing Musical Streams,” Comput. Music J. It should be pointed out that the addition or restoration of harmonics will have the effect of sharpening the rise of the leading edge of transient signals; this is analogous to edge enhancement in video. It has been observed that the rendering of the leading edge of transient signals is a key element in the perception of tone color or timbre and in the rapid identification of sounds. Thus, restoring the harmonics lost to audio compression also serves to restore timbre, resulting in a higher quality listening experience.

While this method is obviously useful for compressed digital audio signals, it is also useful to enhance non-compressed digital audio signals. This will result in a richer timbre or tone color to the audio signal and an enhanced listening experience.

SUMMARY OF THE INVENTION

The present invention seeks to restore the perceptual and emotional elements lost to the technical process of audio processing. The present invention uses a psychoacoustic model to translate an encoded digital signal into data bands that are analyzed for harmonic significance. A frequency analysis is then performed and sections of sound that are deficient in harmonic quality are identified. The sections are analyzed for their fundamental frequency and amplitude. Additional signals of higher order harmonics for the sections are created and the higher order harmonics are added back to the coded signal to form a newly enhanced signal, which is inverse filtered and converted to an analog waveform for consumption by the listener.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 represents a block diagram of the audio enhancement process.

FIG. 2 shows a block diagram of the memory elements of the proposed harmonic enhancement process.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

Common digital audio standards such as MPEG-1 (Layers I-III), MPEG-2, Microsoft Windows Media audio, PAC, ATRAC, and others use a variety of encoding techniques to quantize and produce digital representations of analog acoustic sources. The sampling and encoding of audio is performed according to complex psychoacoustic models of human auditory perception in conjunction with data reduction schemes to produce a coded audio signal which can be decoded with less sophisticated circuitry to produce a stereophonic audio signal. Limitations on bandwidth and bit rate requirements for the storage and transmission of digital data dictate the use of inherently lossy coding algorithms. The purpose of the psychoacoustic model is to take advantage of the fact that the human auditory system can detect sound information up to certain thresholds and the presence of certain sounds can influence the ability of the brain to detect and perceive other sounds. The overall amount of data can be reduced by not encoding the audio signals that would be masked from the perception of the listener. For this reason, this family of encoding schemes is referred to as perceptual encoding.

Perceptual coding commonly works by separating an incoming audio signal into groups of bands that are compared to the psychoacoustic model. Those signals that are above the auditory threshold are quantized and passed through the encoding chain. The signals below the masking threshold are discarded, and all information from those samples is destroyed. The net effect is a final audio signal that is representative of the original analog source but that is inherently incomplete. Some of the information that is lost in the perceptual coding processes is some of the most important information necessary to retain the richness of the original analog recording. One of the major reasons for the effect is the fact that most psychoacoustic models are created and tested using static, non-organic sounds such as steady sinusoidal tones. The tones are produced at varying amplitudes and frequencies to determine the clinical ranges of human audio perception. Models, however, do not incorporate the complex and often unpredictable response of the ear to complex changing stimuli such as musical recordings which incorporate the perception of several layers of harmonics. The resulting digital signals are often described as being technically precise, but lacking in perceptual depth.
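By way of non-limiting illustration only, the band-discarding behavior of a perceptual coder described above may be sketched as follows. The function name `perceptual_gate`, the per-band levels in decibels, and the threshold values are hypothetical and chosen purely for illustration; a real coder derives its thresholds from a psychoacoustic model and quantizes the retained bands rather than passing them through unchanged.

```python
def perceptual_gate(band_levels_db, threshold_db):
    """Keep bands at or above their threshold; mark the rest as discarded.

    Both arguments are hypothetical per-band values in dB. A discarded
    band is represented by None, reflecting that its information is not
    retained in the coded signal.
    """
    if len(band_levels_db) != len(threshold_db):
        raise ValueError("one threshold per band is required")
    return [
        level if level >= thresh else None
        for level, thresh in zip(band_levels_db, threshold_db)
    ]


# Illustrative values only: four bands and their assumed thresholds.
bands = [62.0, 18.0, 45.0, 9.0]
thresholds = [20.0, 25.0, 30.0, 15.0]
kept = perceptual_gate(bands, thresholds)
```

In this sketch the second and fourth bands fall below their assumed thresholds and are dropped, which corresponds to the information that cannot later be recovered by decoding.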

The present invention is designed to enhance a pre-produced digital audio signal to produce a more musically convincing product for the listener. The digital damage done to the audio signal in the form of quantization noise, and the information lost during the original recording encoding cannot be directly recovered during the decoding process. It is therefore necessary to create a set of processing techniques and algorithms that will work in conjunction with previously established decoding standards to produce a new enhanced output signal.

The DSP implementation, as shown in FIG. 1, involves the use of a harmonic analyzer to examine the existing encoded data. In order to minimize the amount of digital noise from further data conversions, the encoded data is reevaluated after the audio stream has passed through the demultiplexing and error checking processes of the decoder. The subbands of digital data are windowed and scaled at values appropriate for the harmonic analysis. A filterbank is applied to the newly reconstructed bands of data, and an enhanced audio signal is created.
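By way of non-limiting illustration only, the windowing and scaling of a subband frame prior to harmonic analysis may be sketched as follows. The specification does not name a particular window or scale factor; the Hann window, the function names `hann` and `window_and_scale`, and the scale value used here are assumptions made solely for the example.

```python
import math


def hann(n):
    """Return an n-point symmetric Hann window (an assumed window choice)."""
    if n < 2:
        return [1.0] * n
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def window_and_scale(subband, scale=1.0):
    """Apply the window to one frame of subband samples, then scale it.

    `scale` is a hypothetical gain; the appropriate value would depend
    on the decoder and on the analysis that follows.
    """
    weights = hann(len(subband))
    return [scale * s * w for s, w in zip(subband, weights)]


# Illustrative frame of eight constant samples.
frame = [1.0] * 8
shaped = window_and_scale(frame, scale=2.0)
```

The window tapers the frame to zero at both ends, which limits spectral leakage when the frame is subsequently analyzed for its frequency content.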

The psychoacoustic analyzer dynamically examines the decoded subbands of data with adaptive sample windowing to account for the differences in window size necessary to accurately detect transient audio information and frequency dependent audio information. A buffer, as shown in FIG. 2, is used to store sequential window information for dynamic analysis. In each sample window, the fundamental frequency of the incoming signal is determined and a series of supplementary signals is created at multiples of the detected fundamental frequency. The supplementary signals have progressively smaller amplitudes as they are created. The original signal and the artificially created harmonic implements are merged together and placed in a buffer for distribution to inverse filterbanks for the final creation of the analog output signal.
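By way of non-limiting illustration only, the per-window steps described above (determining a fundamental frequency and amplitude, creating supplementary signals at multiples of that frequency with progressively smaller amplitudes, and merging them with the original signal) may be sketched as follows. The function names, the use of the strongest discrete Fourier transform bin as the fundamental estimate, the harmonic orders, the `rolloff` factor, and the zero-phase sinusoids are all assumptions made for the example and are not taken from the specification.

```python
import cmath
import math


def estimate_fundamental(window, sample_rate):
    """Estimate frequency and amplitude from the strongest DFT bin.

    Assumption: the strongest bin is treated as the fundamental, which
    holds for a simple tone but not for every musical signal.
    """
    n = len(window)
    best_bin, best_mag = 1, 0.0
    for k in range(1, n // 2):
        acc = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        if abs(acc) > best_mag:
            best_bin, best_mag = k, abs(acc)
    return best_bin * sample_rate / n, 2.0 * best_mag / n


def add_harmonics(window, sample_rate, orders=(2, 3, 4), rolloff=0.5):
    """Merge synthesized harmonics of decreasing amplitude into the window.

    `orders` and `rolloff` are hypothetical parameters; each successive
    harmonic is scaled by a further factor of `rolloff`.
    """
    f0, a0 = estimate_fundamental(window, sample_rate)
    out = list(window)
    for i, order in enumerate(orders, start=1):
        freq = f0 * order
        if freq >= sample_rate / 2:  # stay below the Nyquist frequency
            break
        amp = a0 * rolloff ** i
        for t in range(len(out)):
            out[t] += amp * math.sin(2.0 * math.pi * freq * t / sample_rate)
    return out


# Illustrative input: a 250 Hz tone of amplitude 0.5 sampled at 8 kHz.
sample_rate = 8000
tone = [0.5 * math.sin(2.0 * math.pi * 250 * t / sample_rate) for t in range(256)]
f0, a0 = estimate_fundamental(tone, sample_rate)
enhanced = add_harmonics(tone, sample_rate)
```

In this sketch the phase of each added harmonic is not matched to the original signal, and the analysis window has a fixed length; an implementation following the adaptive windowing described above would vary the window size with the signal content.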

The psychoacoustic model used in the harmonic analysis is designed based upon the responsiveness of the human ear to harmonic stimulation. For the sake of audio reproduction, the preferred embodiment of the new psychoacoustic model is to use musical influences as the test and effectiveness criteria for the design. In this psychoacoustic model, instead of using static, non-organic sounds such as steady sinusoidal tones, the complexity of musical influences is used and would incorporate several layers of harmonics.

Thus, it should be understood that the embodiments and examples described herein have been chosen and described in order to best illustrate the principles of the invention and its practical applications to thereby enable one of ordinary skill in the art to best utilize the invention in various embodiments and with various modifications as are suited for particular uses contemplated. Even though specific embodiments of this invention have been described, they are not to be taken as exhaustive. There are several variations that will be apparent to those skilled in the art.

What is claimed is:
 1. A method for modifying the spectral content of an audio signal, comprising the steps of: identifying sections of an audio signal that are deficient in harmonic quality; adding higher order harmonics into said audio signal to form an enhanced signal; and inverse filtering said enhanced signal.
 2. The method of claim 1, wherein said audio signal is an encoded digital signal.
 3. The method of claim 1, wherein the step of identifying includes the creation of a psychoacoustic model.
 4. The method of claim 3, wherein the audio signal is first translated into data bands, and the psychoacoustic model identifies sections of the data bands that are deficient in harmonic quality.
 5. The method of claim 4, wherein the fundamental frequency and amplitude of the harmonically deficient data bands are analyzed prior to creating additional higher order harmonics for the harmonically deficient data bands.
 6. The method of claim 1, wherein the inverse-filtered enhanced signal is a digital signal.
 7. The method of claim 6, further comprising the step of converting the inverse-filtered enhanced digital signal to an analog waveform.
 8. The method of claim 4, wherein said psychoacoustic model incorporates several layers of harmonics to identify said deficient data bands.